42 research outputs found
vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence
The R package vsgoftest performs goodness-of-fit (GOF) tests of various classical families of distributions, based on the Shannon entropy and Kullback-Leibler divergence estimators developed by Vasicek (1976) and Song (2002). The so-called Vasicek-Song (VS) tests apply to continuous data, typically drawn from a distribution with a density, even in the presence of ties. Their excellent properties, notably high power in a wide variety of situations, make them relevant alternatives to classical GOF tests in any domain of application requiring statistical processing. The theoretical framework of VS tests is summarized and followed by a detailed description of the different features of the package. The power and computational time of VS tests are studied through comparison with other GOF tests. Application to real datasets illustrates the easy-to-use functionalities of the vsgoftest package.
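At the core of the VS tests is Vasicek's spacing-based estimator of Shannon entropy. As a rough illustration of the idea only, here is a minimal Python sketch (the function name and window choice are ours; this is not the package's R implementation):

```python
import numpy as np

def vasicek_entropy(x, m):
    """Vasicek (1976) spacing estimator of Shannon entropy.

    Averages the log of normalized m-spacings of the order statistics;
    indices falling outside the sample are clamped to the boundary values.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(n)
    upper = np.minimum(i + m, n - 1)   # X_(i+m), clamped at X_(n)
    lower = np.maximum(i - m, 0)       # X_(i-m), clamped at X_(1)
    spacings = x[upper] - x[lower]
    return float(np.mean(np.log(n / (2.0 * m) * spacings)))

rng = np.random.default_rng(0)
sample = rng.uniform(size=5000)
print(vasicek_entropy(sample, m=20))  # close to 0, the entropy of U(0, 1)
```

The window size `m` trades bias against variance; the package itself selects it according to the criteria discussed in the cited papers.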
International audience. This paper mainly aims at unifying into a single goodness-of-fit procedure the tests based on Shannon entropy (called S-tests) introduced by Vasicek in 1976 and the tests based on relative entropy, or Kullback-Leibler divergence (called KL-tests), introduced by Song in 2002. While Vasicek's procedure is widely used in the literature, Song's has remained more confidential. Both tests are known to have good power properties and to lead to straightforward computations. However, some asymptotic properties of the S-tests have never been checked, and the link between the two procedures has never been highlighted. Mathematical justification of both tests is detailed here, showing their equivalence for testing any parametric composite null hypothesis of maximum-entropy distributions. For testing any other distribution, the KL-tests remain reliable goodness-of-fit tests, whereas the S-tests become tests of entropy level. Moreover, for a simple null hypothesis, only the KL-tests can be considered. The methodology is applied to a real dataset of a DNA replication process, arising from a collaboration with biologists. The objective is to validate an experimental protocol for detecting chicken cell lines in which the spatiotemporal program of DNA replication is not correctly executed. We propose a two-step approach through entropy-based tests: first, a Fisher distribution with non-integer parameters is retained as reference, and then the experimental protocol is validated.
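To give a feel for the KL-test construction, the statistic can be read as an estimated Kullback-Leibler divergence: the cross-entropy of the sample under the fitted null density minus a nonparametric entropy estimate. A hedged Python sketch under our own naming (normal null fitted by maximum likelihood; the actual test statistics and critical values are in the cited papers and the vsgoftest package):

```python
import numpy as np

def vasicek_entropy(x, m):
    # Vasicek (1976) spacing estimator of Shannon entropy
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(n)
    spacings = x[np.minimum(i + m, n - 1)] - x[np.maximum(i - m, 0)]
    return float(np.mean(np.log(n / (2.0 * m) * spacings)))

def kl_stat_normal(x, m):
    """Estimated KL divergence from the sample to a fitted normal null.

    With MLE parameters, the mean of (x - mu)^2 / (2 sigma^2) is exactly
    1/2, so the cross-entropy term has a closed form.
    """
    x = np.asarray(x, dtype=float)
    sigma = x.std()
    cross_entropy = 0.5 * np.log(2.0 * np.pi * sigma**2) + 0.5
    return cross_entropy - vasicek_entropy(x, m)

rng = np.random.default_rng(1)
normal_data = rng.normal(size=4000)
expo_data = rng.exponential(size=4000)
print(kl_stat_normal(normal_data, 20))  # near 0: the null fits
print(kl_stat_normal(expo_data, 20))    # clearly larger: evidence against the null
```

Since KL divergence is nonnegative and vanishes only when the null holds, large values of the statistic indicate departure from the hypothesized family, which is exactly the rejection logic of the KL-tests.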
How Variation of Scores of the Program for International Student Assessment Can Be Explained Through Analysis of Information
International audience
Information-Based Parametrization of Log-Linear Models for Categorical Data Analysis
International audience. Zighera (App Stoch Mod Data Anal 1:93-108, 1985) introduced a new parameterization of log-linear models for analyzing categorical data, directly linked to a thorough analysis of discrimination information through Kullback-Leibler divergence. The method mainly aims at quantifying, in terms of information, the variations of a binary variable of interest by comparing two contingency tables (or sub-tables) through the effects of explanatory categorical variables. The present paper establishes the mathematical background necessary to rigorously apply Zighera's parameterization to any categorical data. In particular, identifiability and good properties of asymptotically χ²-distributed test statistics are proven to hold. Determination of the parameters and all tests of effects due to explanatory variables are simultaneous. Application to classical data sets illustrates the contribution with respect to existing methods.
Sur la recherche de φ-entropie à maximisante donnée (On the search for a φ-entropy with a given maximizer)
National audience. In this paper, we are interested in maximum-entropy problems under moment constraints. Contrary to the usual problem of finding the maximizer of a given entropy, or of selecting constraints such that a given distribution is a maximizer, we focus here on the determination of an entropy such that a given distribution is its maximizer. The goal is, in some sense, to adapt the entropy to its maximizer, with potential application in entropy-based goodness-of-fit tests. This allows us to consider distributions outside the exponential family, to which the maximizers of the Shannon entropy belong, and also to restrict attention to simple moment constraints, estimated in practice from the observed sample. Finally, this approach also yields entropic functionals that are functions of both the probability density and the state, allowing us to include skew-symmetric or multimodal distributions in the setting.